    Data-driven approaches for predicting asthma attacks in adults in primary care

    Background: Asthma attacks cause approximately 270 hospitalisations and four deaths per day in the United Kingdom (UK). Previous attempts to construct data-driven risk prediction models of asthma attacks have lacked clinical utility: either producing inaccurate predictions or requiring patient data which are not cost-effective to collect on a large scale (such as electronic monitoring device data). Electronic Health Record (EHR) use throughout the UK enables researchers to harness comprehensive and panoramic patient data, but their cleaning and pre-processing require sophisticated empirical experimentation and data analytics approaches. My objectives were to appraise the previously utilised methods in asthma attack risk prediction modelling for feature extraction, model development, and model selection, and to train and test a model in Scottish EHRs.
    Methods: In this thesis, I used a Scottish longitudinal primary care EHR dataset with linked secondary care records to investigate the optimisation of an asthma attack risk prediction model. To inform the model, I refined methods for estimation of asthma medication adherence from EHRs, compared model training data enrichment procedures, and evaluated measures for validating model performance. After conducting a critical appraisal of the methods employed in the literature, I trained and tested four statistical learning algorithms for prediction in the next four weeks (logistic regression, naïve Bayes classification, random forests, and extreme gradient boosting) and validated model performance in an unseen hold-out dataset. Training data enrichment methods were compared across all algorithms to establish whether the sensitivity of estimating relatively uncommon event incidence, such as asthma attacks in the general asthma population, could be improved. Secondary event horizons were also examined, such as prediction in the next six months. Empirical experimentation established the balanced accuracy to be the most appropriate prediction model performance measure, and the calibration between estimated and observed risk was additionally assessed using the Area Under the Receiver-Operator Curve (AUC).
    Results: Data were available for over 670,000 individuals, followed for up to 17 years (177,306 person-years in total). Binary prediction of asthma attacks in the following four-week period resulted in 1,203,476 data samples, of which 1% contained one or more attacks (12,193 total attacks). In the preliminary model selection phase, the random forest algorithm provided the best balance between accuracy in those with asthma attacks (sensitivity) and in those predicted to have attacks (positive predictive value) in the following four weeks. In an unseen data partition, the final random forest model, with optimised hyper-parameters, achieved an AUC of 0.91, and a balanced accuracy of 73.6% after the application of an optimised decision threshold. Accurate predictions were made for a median of 99.6% of those who did not go on to have attacks (specificity). As expected with rare event predictions, the sensitivity was lower at 47.7%, but this was well balanced with the positive predictive value of 48.9%. Furthermore, several of the secondary models, including predicting asthma attacks in the following 12 weeks, achieved state-of-the-art performance and still had high potential clinical utility.
    Conclusions: I successfully developed an EHR-based model for predicting asthma attacks in the next four weeks. Accurately predicting the occurrence of asthma attacks may facilitate closer monitoring to ensure that preventative therapy is adequately managing symptoms, reinforce the need to keep abreast of triggers, and allow rescue treatments to be administered quickly when necessary.
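    The abstract above reports sensitivity (47.7%), specificity (a median of 99.6%), positive predictive value and balanced accuracy (73.6%) for the same model. As a reading aid, the following minimal sketch shows how these measures relate under the standard definitions, with balanced accuracy taken as the mean of sensitivity and specificity. The function names and the confusion-matrix counts are hypothetical illustrations, not taken from the thesis, and the reported specificity is a median, so the check is only approximate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BinaryMetrics:
    sensitivity: float        # share of true attacks that were predicted
    specificity: float        # share of non-attacks correctly predicted
    ppv: float                # share of predicted attacks that occurred
    balanced_accuracy: float  # mean of sensitivity and specificity


def balanced_accuracy(sensitivity: float, specificity: float) -> float:
    """Standard definition: arithmetic mean of the two class-wise recalls."""
    return (sensitivity + specificity) / 2


def metrics_from_counts(tp: int, fp: int, tn: int, fn: int) -> BinaryMetrics:
    """Derive the four measures from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp)
    return BinaryMetrics(sensitivity, specificity, ppv,
                         balanced_accuracy(sensitivity, specificity))


# Rates quoted in the abstract (specificity is a reported median).
reported = balanced_accuracy(0.477, 0.996)
print(f"balanced accuracy from reported rates: {reported:.4f}")

# Hypothetical counts, purely illustrative (not data from the thesis).
example = metrics_from_counts(tp=50, fp=50, tn=900, fn=50)
print(example)
```

    Under these assumptions the reported rates give a balanced accuracy of about 0.7365, in line with the 73.6% quoted in the abstract.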

    Predicting asthma attacks in primary care: protocol for developing a machine learning-based prediction model

    INTRODUCTION: Asthma is a long-term condition with rapid onset worsening of symptoms ('attacks') which can be unpredictable and may prove fatal. Models predicting asthma attacks require high sensitivity to minimise mortality risk, and high specificity to avoid unnecessary prescribing of preventative medications that carry an associated risk of adverse events. We aim to create a risk score to predict asthma attacks in primary care using a statistical learning approach trained on routinely collected electronic health record data. // METHODS AND ANALYSIS: We will employ machine-learning classifiers (naïve Bayes, support vector machines, and random forests) to create an asthma attack risk prediction model, using the Asthma Learning Health System (ALHS) study patient registry comprising 500 000 individuals across 75 Scottish general practices, with linked longitudinal primary care prescribing records, primary care Read codes, accident and emergency records, hospital admissions and deaths. Models will be compared on a partition of the dataset reserved for validation, and the final model will be tested in both an unseen partition of the derivation dataset and an external dataset from the Seasonal Influenza Vaccination Effectiveness II (SIVE II) study. // ETHICS AND DISSEMINATION: Permissions for the ALHS project were obtained from the South East Scotland Research Ethics Committee 02 [16/SS/0130] and the Public Benefit and Privacy Panel for Health and Social Care (1516-0489). Permissions for the SIVE II project were obtained from the Privacy Advisory Committee (National Services NHS Scotland) [68/14] and the National Research Ethics Committee West Midlands-Edgbaston [15/WM/0035]. The subsequent research paper will be submitted for publication to a peer-reviewed journal and code scripts used for all components of the data cleaning, compiling, and analysis will be made available in the open source GitHub website (https://github.com/hollytibble)

    Measuring and reporting treatment adherence: what can we learn by comparing two respiratory conditions?

    Medication non-adherence, defined as any deviation from the regimen recommended by a patient's healthcare provider, can increase morbidity, mortality and side effects, while reducing effectiveness. Through studying two respiratory conditions, asthma and tuberculosis (TB), we thoroughly review the current understanding of the measurement and reporting of medication adherence. In this paper, we identify major methodological issues in the standard ways that adherence has been conceptualised, defined and studied in asthma and TB. Between and within the two diseases there are substantial variations in adherence reporting, linked to differences in dosing intervals and treatment duration. Critically, the communicable nature of TB has resulted in dose-by-dose monitoring becoming a recommended treatment standard. Through the lens of these similarities and contrasts, we highlight contemporary shortcomings in the generalised conceptualisation of medication adherence. Furthermore, we outline elements in which knowledge could be directly transferred from one condition to the other, such as the application of large-scale cost-effective monitoring methods in TB to resource-poor settings in asthma. To develop a more robust evidence-based approach, we recommend the use of the standard terms detailed in the ABC taxonomy when measuring and discussing adherence. Regimen and intervention development and use should be based on sufficient evidence of the commonality and type of adherence behaviours displayed by patients with the relevant condition. A systematic approach to the measurement and reporting of adherence could improve the value and generalisability of research across all health conditions.